REPLICATION PROTEIN

This invention relates to a screening method for the identification of agents which 
modulate the activity of a DNA replication protein as a target for intervention in cancer 
therapy and includes agents which modulate said activity. 

The DNA replication initiation process involves assembly of replication proteins into
higher order complexes inside the nucleus during G1 phase of the cell cycle, followed by
their activation to begin DNA synthesis (S phase). Intensive study has focused on a few
proteins, largely identified by yeast genetics, that are involved in the replication complex
assembly process and its regulation (Blow, 2001; Diffley and Labib, 2002). These
include the origin recognition complex (ORC), Cdc6 and Cdt1, which are components of
the pre-replication complex, and the Mcm proteins, which are believed to be the main
replicative DNA helicase. The production and assembly of these proteins is regulated by
cyclin dependent protein kinase 2 (cdk2) and by proteins which impinge on cdk2
activity, such as the cdk inhibitors p27Kip1 and p21Cip1.

Initiation of DNA replication can now be reconstituted with isolated mammalian nuclei
and cytosolic extracts (Krude, 2000; Krude et al., 1997; Laman et al., 2001; Stoeber et
al., 1998). Replication complex assembly and activation of DNA synthesis have been
separated and reproduced under the regulation of recombinant cyclin-dependent kinases
(cdks), using nuclei and extracts derived from G1 phase mouse cells (Coverley et al.,
2002).

The mammalian initiation process can be separated into an assembly phase which is
positively regulated by cyclin E-cdk2 and negatively regulated by cyclin A-cdk2, and
an activation stage which is regulated by cyclin A-cdk2 (Coverley et al., 2002). Both
phases can be reconstituted in vitro and are highly sensitive to recombinant cdk2
concentration. At the active concentration cyclin E-cdk2 stimulates replication complex
assembly by cooperating with Cdc6, making G1 nuclei competent to replicate in vitro. In
contrast, cyclin A-cdk2 has two separable functions with in vitro optima at different
concentrations: activation of DNA synthesis in replication complexes that are already
assembled, and inhibition of assembly of new complexes. The dual functions of cyclin A
ensure that the assembly phase (G1) ends before DNA synthesis (S) begins, thereby
preventing re-initiation until the next cell cycle.

A number of changes to chromatin bound proteins occur when DNA synthesis is
activated in vitro by recombinant cyclin A-cdk2. The present invention relates to the
finding that a Cdc6-related antigen, p85, correlates with the initiation of DNA replication
and is regulated by cyclin A-cdk2. The protein was cloned from a mouse embryo library
and identified as mouse Ciz1.

Human Ciz1 (Cip1 Interacting Zinc-finger protein) was described in 1999 after a two-
hybrid screen to identify cyclin E/p21 complex interacting proteins. Mitsui et al. (1999)
showed that Ciz1 interacts with the cdk inhibitor p21Cip1, but not with cyclin
E. No analysis of Ciz1 function was reported, except that a role in transcription was
sought but not found.

In vitro analysis has shown that Ciz1 protein positively regulates initiation of DNA
replication and that its activity is modulated by cdk phosphorylation at threonine 191/2,
linking it to the cdk-dependent pathways that control initiation. The transcription factor-
like features of Ciz1 are not required for replication function, implying that Ciz1 has
more than one function. The embryonic form of mouse Ciz1 is alternatively spliced,
compared to predicted and somatic forms. Human Ciz1 is also alternatively spliced, with
variability in the same exons as mouse Ciz1. It has been found that recombinant
embryonic form Ciz1 promotes initiation of mammalian DNA replication and that
pediatric cancers express 'embryonic-like' forms of Ciz1. Without wishing to be held to
one theory, the inventors propose that Ciz1 mis-splicing produces embryonic form Ciz1
at inappropriate times in development. This promotes DNA replication and contributes
to formation or progression of cancer cell lineages.

A number of techniques have been developed in recent years which purport to
specifically ablate genes and/or gene products. For example, the use of anti-sense
nucleic acid molecules to bind to and thereby block or inactivate target mRNA
molecules is an effective means to inhibit the production of gene products.

A much more recent technique to specifically ablate gene function is through the 
introduction of double stranded RNA, also referred to as inhibitory RNA (RNAi), into a 
cell which results in the destruction of mRNA complementary to the sequence included 
in the RNAi molecule. The RNAi molecule comprises two complementary strands of 
RNA (a sense strand and an antisense strand) annealed to each other to form a double 
stranded RNA molecule. The RNAi molecule is typically derived from the exonic or 
coding sequence of the gene which is to be ablated. 
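
The relationship between the sense and antisense strands described above can be illustrated with a minimal sketch. It assumes a short, made-up exonic fragment given as coding-strand DNA; the helper names and the example sequence are hypothetical and are not taken from any sequence disclosed herein.

```python
# Illustrative sketch only: derive the two complementary RNA strands of a
# double stranded RNA from a coding (sense) DNA fragment.
# The fragment below is an invented example, not a disclosed sequence.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def sense_strand(coding_dna: str) -> str:
    """Return the sense RNA strand: the coding DNA with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def antisense_strand(sense_rna: str) -> str:
    """Return the antisense RNA strand (reverse complement), written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(sense_rna))


exon_fragment = "ATGGCTTCA"          # hypothetical exonic fragment
sense = sense_strand(exon_fragment)
antisense = antisense_strand(sense)
print(sense, antisense)
```

Taking the reverse complement of the antisense strand returns the sense strand, which is the property that allows the two strands to anneal to each other.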

Nucleic acids and proteins have both a linear sequence structure, as defined by their base 
or amino acid sequence, and also a three dimensional structure which in part is 
determined by the linear sequence and also the environment in which these molecules 
are located. Conventional therapeutic molecules are small molecules, for example, 
peptides, polypeptides, or antibodies, which bind target molecules to produce an 
agonistic or antagonistic effect. It has become apparent that nucleic acid molecules also 
have potential with respect to providing agents with the requisite binding properties 
which may have therapeutic utility. These nucleic acid molecules are typically referred 
to as aptamers. Aptamers are small, usually stabilised, nucleic acid molecules which 
comprise a binding domain for a target molecule. 

Aptamers may comprise at least one modified nucleotide base. The term "modified
nucleotide base" encompasses nucleotides with a covalently modified base and/or sugar.
For example, modified nucleotides include nucleotides having sugars which are
covalently attached to low molecular weight organic groups other than a hydroxyl group
at the 3' position and other than a phosphate group at the 5' position. Thus modified
nucleotides may also include 2' substituted sugars such as 2'-O-methyl-; 2'-O-alkyl;
2'-O-allyl; 2'-S-alkyl; 2'-S-allyl; 2'-fluoro-; 2'-halo or 2'-azido-ribose, carbocyclic sugar
analogues, α-anomeric sugars; epimeric sugars such as arabinose, xyloses or lyxoses,
pyranose sugars, furanose sugars, and sedoheptulose.

Modified nucleotides are known in the art and include, by example and not by way of
limitation, alkylated purines and/or pyrimidines; acylated purines and/or pyrimidines;
or other heterocycles. These classes of pyrimidines and purines are known in the art and
include pseudoisocytosine; N4,N4-ethanocytosine; 8-hydroxy-N6-methyladenine;
4-acetylcytosine; 5-(carboxyhydroxylmethyl) uracil; 5-fluorouracil; 5-bromouracil;
5-carboxymethylaminomethyl-2-thiouracil; 5-carboxymethylaminomethyl uracil;
dihydrouracil; inosine; N6-isopentyl-adenine; 1-methyladenine; 1-methylpseudouracil;
1-methylguanine; 2,2-dimethylguanine; 2-methyladenine; 2-methylguanine;
3-methylcytosine; 5-methylcytosine; N6-methyladenine; 7-methylguanine;
5-methylaminomethyl uracil; 5-methoxy amino methyl-2-thiouracil;
β-D-mannosylqueosine; 5-methoxycarbonylmethyluracil; 5-methoxyuracil;
2-methylthio-N6-isopentenyladenine; uracil-5-oxyacetic acid methyl ester; pseudouracil;
2-thiocytosine; 5-methyl-2-thiouracil; 2-thiouracil; 4-thiouracil; 5-methyluracil;
N-uracil-5-oxyacetic acid methylester; uracil-5-oxyacetic acid; queosine; 2-thiocytosine;
5-propyluracil; 5-propylcytosine; 5-ethyluracil; 5-ethylcytosine; 5-butyluracil;
5-pentyluracil; 5-pentylcytosine; 2,6-diaminopurine; methylpseudouracil;
1-methylguanine; and 1-methylcytosine.

Aptamers may be synthesized using conventional phosphodiester linked nucleotides
using standard solid or solution phase synthesis techniques which are known in the art.
Linkages between nucleotides may use alternative linking molecules. For example,
linking groups of the formula P(O)S (thioate); P(S)S (dithioate); P(O)NR'2; P(O)R';
P(O)OR6; CO; or CONR'2, wherein R' is H (or a salt) or alkyl (1-12C) and R6 is alkyl
(1-9C), joined to adjacent nucleotides through -O- or -S-.

Other techniques which purport to specifically ablate genes and/or gene products focus
on modulating the function or interfering with the activity of protein molecules.
Proteins can be targeted by chemical inhibitors drawn, for example, from existing small
molecule libraries.

Antibodies, preferably monoclonal, can be raised for example in mice or rats against
different protein isoforms. Antibodies, also known as immunoglobulins, are protein
molecules which have specificity for foreign molecules (antigens). Immunoglobulins
(Ig) are a class of structurally related proteins consisting of two pairs of polypeptide
chains, one pair of light (L) (low molecular weight) chains (κ or λ), and one pair of heavy
(H) chains (γ, α, μ, δ and ε), all four linked together by disulphide bonds. Both H and L
chains have regions that contribute to the binding of antigen and that are highly variable
from one Ig molecule to another. In addition, H and L chains contain regions that are
non-variable or constant.

The L chains consist of two domains. The carboxy-terminal domain is essentially
identical among L chains of a given type and is referred to as the "constant" (C) region.
The amino terminal domain varies from one L chain to another and contributes to the
binding site of the antibody. Because of its variability, it is referred to as the "variable"
(V) region.

The H chains of Ig molecules are of several classes, α, μ, δ, ε and γ (of which there are
several sub-classes). An assembled Ig molecule, consisting of one or more units of two
identical H and L chains, derives its name from the H chain that it possesses. Thus,
there are five Ig isotypes: IgA, IgM, IgD, IgE and IgG (with four sub-classes based on
the differences in the H chains, i.e., IgG1, IgG2, IgG3 and IgG4). Further detail
regarding antibody structure and their various functions can be found in Using
Antibodies: A laboratory manual, Cold Spring Harbour Laboratory Press.

Chimeric antibodies are recombinant antibodies in which all of the V-regions of a mouse
or rat antibody are combined with human antibody C-regions. Humanised antibodies are
recombinant hybrid antibodies which fuse the complementarity determining regions from
a rodent antibody V-region with the framework regions from the human antibody V-
regions. The C-regions from the human antibody are also used. The complementarity
determining regions (CDRs) are the regions within the N-terminal domain of both the
heavy and light chain of the antibody to which the majority of the variation of the V-
region is restricted. These regions form loops at the surface of the antibody molecule.
These loops provide the binding surface between the antibody and antigen.



Antibodies from non-human animals provoke an immune response to the foreign
antibody and its removal from the circulation. Both chimeric and humanised antibodies
have reduced antigenicity when injected into a human subject because there is a reduced
amount of rodent (i.e. foreign) antibody within the recombinant hybrid antibody, while
the human antibody regions do not elicit an immune response. This results in a weaker
immune response and a decrease in the clearance of the antibody. This is clearly
desirable when using therapeutic antibodies in the treatment of human diseases.
Humanised antibodies are designed to have fewer "foreign" antibody regions and are
therefore thought to be less immunogenic than chimeric antibodies.



Other techniques for targeting at the protein level include the use of randomly generated
peptides that specifically bind to proteins, and any other molecules which bind to
proteins or protein variants and modify the function thereof.

Understanding the DNA replication process is of prime concern in the field of cancer
therapy. It is known that cancer cells can become resistant to chemotherapeutic agents
and can evade detection by the immune system. There is an ongoing need to identify
targets for cancer therapy so that new agents can be identified. The DNA replication
process represents a prime target for drug intervention in cancer therapy. There is a
need to identify gene products which modulate DNA replication and which contribute to
formation or progression of cancer cell lineages, and to develop agents that affect their
function.



According to one aspect of the present invention there is provided the use of a
polypeptide with the activity of Ciz1, or any variant thereof, as a target for the
identification of agents which modulate DNA replication.

According to an alternative aspect of the invention there is provided a screening method
for the identification of agents which modulate DNA replication wherein the screening
method comprises the use of Ciz1 and variants thereof.

Preferably the screening method comprises the steps of:

(i) forming a preparation comprising a polypeptide molecule, or an active fragment
thereof, encoded by a nucleic acid molecule selected from the group consisting of:

a) a nucleic acid molecule comprising a nucleic acid sequence represented in Fig
8a or 8b;

b) a nucleic acid molecule which hybridizes to the nucleic acid sequence in (a)
and which has Ciz1 activity or activity of a variant thereof;

c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate
because of the genetic code to the sequences in a) and b);

d) a nucleic acid molecule derived from the genomic sequence at the Ciz1 locus;

and a candidate agent to be tested; and

(ii) detecting or measuring the effect of the agent on the activity of said polypeptide.



Assays for the detection of DNA replication are known in the art. Activity residing in
Ciz1, or derived peptide fragments, and the effect of potential therapeutic agents on that
activity would be assayed in vitro or in vivo.

In vitro assays for Ciz1 protein activity would comprise synchronised isolated G1 phase
nuclei and either S phase extract or G1 phase extract supplemented with cyclin-
dependent kinases. Inclusion of Ciz1 or derived peptide fragments stimulates initiation
of DNA replication in these circumstances and can be monitored visually (by scoring
nuclei that have incorporated fluorescent nucleotides during in vitro reactions) or by
measuring incorporation of radioactive nucleotides. The assay for therapeutic reagents
that interfere with Ciz1 protein function would involve looking for inhibition of DNA
replication in these assays. The effect of agents on Ciz1 nuclear localisation, chromatin
binding, stability, modification and protein-protein interactions could also be monitored
in these assays.

In vivo assays will include creation of cell and mouse models that over-express or under-
express Ciz1, or derived fragments, resulting in altered cell proliferation. The
preparation of transgenic animals is generally known in the art and within the ambit of
the skilled person. The assay for therapeutic reagents would involve analysis of cell-
cycle time, initiation of DNA replication and cancer incidence in the presence and
absence of drugs that either impinge on Ciz1 protein activity, or interfere with Ciz1
production by targeting Ciz1 and its variants at the RNA level.

In a preferred method of the invention said hybridisation conditions are stringent.

Stringent hybridisation/washing conditions are well known in the art. For example,
stringent conditions are those under which nucleic acid hybrids remain stable after
washing in 0.1x SSC, 0.1% SDS at 60°C. It is well known in the art that optimal
hybridisation conditions can be calculated if the sequence of the nucleic acid is known.
Typically, hybridisation conditions use 4-6x SSPE (20x SSPE contains 175.3g NaCl,
88.2g NaH2PO4·H2O and 7.4g EDTA dissolved to 1 litre and the pH adjusted to 7.4);
5-10x Denhardts solution (50x Denhardts solution contains 5g Ficoll (Type 400,
Pharmacia), 5g polyvinylpyrrolidone and 5g bovine serum albumin); 100 µg-1.0 mg/ml
sonicated salmon/herring DNA; 0.1-1.0% sodium dodecyl sulphate; optionally 40-60%
deionised formamide. Hybridisation temperature will vary depending on the GC content
of the nucleic acid target sequence but will typically be between 42° and 65°C.
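
The working concentrations quoted above are prepared by diluting the concentrated stocks (20x SSPE, 50x Denhardts solution). As a minimal sketch, the volume of stock required follows from the standard dilution relation C1·V1 = C2·V2. The function name and the 100 ml final volume are assumptions chosen for illustration only.

```python
# Illustrative sketch only: volume of a concentrated stock needed to reach a
# working concentration, using C1 * V1 = C2 * V2.
# The function name and the example volume are assumptions.


def stock_volume_ml(stock_x: float, final_x: float, final_volume_ml: float) -> float:
    """Return the volume (ml) of an 'N x' stock needed for the final mix."""
    if not 0 < final_x <= stock_x:
        raise ValueError("final concentration must be positive and not exceed the stock")
    return final_volume_ml * final_x / stock_x


# Example: 100 ml of hybridisation mix at 5x SSPE and 5x Denhardts solution.
sspe_ml = stock_volume_ml(stock_x=20, final_x=5, final_volume_ml=100)
denhardts_ml = stock_volume_ml(stock_x=50, final_x=5, final_volume_ml=100)
print(sspe_ml, denhardts_ml)
```

The remaining volume is made up with the other components (carrier DNA, SDS, optional formamide) and water.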

In a preferred method of the invention said polypeptide is modified by deletion,
substitution or addition of at least one amino acid residue of the sequence.

A modified polypeptide or variant, i.e. a fragment polypeptide, and a reference
polypeptide may differ in amino acid sequence by one or more substitutions, additions,
deletions or truncations, which may be present in any combination. Among preferred
variants are those that vary from a reference polypeptide by conservative amino acid
substitutions. Such substitutions are those that substitute a given amino acid by another
amino acid of like characteristics. The following non-limiting list of amino acids are
considered conservative replacements (similar): a) alanine, serine, and threonine;
b) glutamic acid and aspartic acid; c) asparagine and glutamine; d) arginine and lysine;
e) isoleucine, leucine, methionine and valine; and f) phenylalanine, tyrosine and
tryptophan. Preferred are variants which retain the same biological function and activity
as the reference polypeptide from which they vary. Alternatively, variants include those
with an altered biological function, for example variants which act as antagonists, so
called "dominant negative" variants.
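
The grouping of conservative replacements in a) to f) above lends itself to a simple lookup. The following is a minimal sketch, assuming one-letter amino acid codes; the function and variable names are hypothetical, and the grouping is the non-limiting list given above rather than a general substitution matrix.

```python
# Illustrative sketch only: test whether a substitution is "conservative"
# according to groups a) to f) listed in the text (one-letter codes).

CONSERVATIVE_GROUPS = [
    set("AST"),   # a) alanine, serine, threonine
    set("ED"),    # b) glutamic acid, aspartic acid
    set("NQ"),    # c) asparagine, glutamine
    set("RK"),    # d) arginine, lysine
    set("ILMV"),  # e) isoleucine, leucine, methionine, valine
    set("FYW"),   # f) phenylalanine, tyrosine, tryptophan
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and fall within the same listed group."""
    if original == replacement:
        return False  # no substitution has taken place
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


print(is_conservative("S", "T"))  # serine to threonine, same group a)
print(is_conservative("D", "K"))  # aspartic acid to lysine, different groups
```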

Alternatively or in addition, non-conservative substitutions may give the desired
biological activity; see Cain SA, Williams DM, Harris V, Monk PN. Selection of novel
ligands from a whole-molecule randomly mutated C5a library. Protein Eng. 2001
Mar;14(3):189-93, which is incorporated by reference.

A functionally equivalent polypeptide according to the invention is a variant wherein
one or more amino acid residues are substituted with conserved or non-conserved amino
acid residues, or one in which one or more amino acid residues includes a substituent
group. Conservative substitutions are the replacements, one for another, among the
aliphatic amino acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and
Thr; exchange of the acidic residues Asp and Glu; substitution between amide residues
Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among
aromatic residues Phe and Tyr.

In addition, the invention features polypeptide sequences having at least 75% identity
with the polypeptide sequences as herein disclosed, or fragments and functionally
equivalent peptides thereof. In one embodiment, the peptides have at least
85% identity, more preferably at least 90% identity, even more preferably at least 95%
identity, still more preferably at least 97% identity, and most preferably at least 99%
identity with the amino acid sequences illustrated herein.
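
The text does not specify how percentage identity is to be computed. As a minimal sketch under a stated assumption, the example below takes two sequences that have already been aligned to equal length (gaps shown as "-") and reports identical positions as a percentage of the alignment length. The function name and the example sequences are invented for illustration; other conventions, for example dividing by the shorter sequence length, give different values.

```python
# Illustrative sketch only: percent identity between two pre-aligned sequences.
# Assumption: identity = identical aligned positions / alignment length * 100.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity for two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


reference = "MKTAYIAK"   # invented reference peptide
variant = "MKTAYLAK"     # invented variant with one substitution
identity = percent_identity(reference, variant)
print(identity, identity >= 75.0)
```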

In a preferred method of the invention said nucleic acid molecule comprises the nucleic
acid sequence encoding the amino acid sequence Ciz1 in Fig 2 or variants thereof. In a
further preferred method of the invention said nucleic acid molecule consists of the
nucleic acid sequence which encodes the amino acid sequence Ciz1 in Fig 2 or variants
thereof.

In a further preferred method of the invention said polypeptide molecule comprises the
amino acid sequence Ciz1 in Fig 2 or variants thereof. In a further preferred method of
the invention said polypeptide molecule consists of the amino acid sequence Ciz1 in Fig
2 or variants thereof.

In a further preferred method of the invention said polypeptide is expressed by a cell,
preferably a mammalian cell, or animal and said screening method is a cell-based
screening method. Preferably said cell naturally expresses the Ciz1 polypeptide.
Alternatively said cell is transfected with a nucleic acid molecule encoding a
polypeptide with Ciz1 activity (or a variant molecule found in cancer cells).

According to a further aspect of the invention there is provided an agent obtainable by
the method according to the invention.

Preferably said agent is an antagonist of Ciz1 mediated DNA replication. Alternatively
said agent is an agonist of Ciz1 mediated DNA replication.

In a further preferred method of the invention said agent is selected from the group 
consisting of: polypeptide; peptide; aptamer; chemical; antibody; nucleic acid. 

Preferably said agent is an anti-sense nucleic acid molecule which binds to and thereby
blocks or inactivates the mRNA encoded by any of the nucleic acid sequences in (i)
above.

In an alternative embodiment, said agent is an RNAi molecule and comprises two
complementary strands of RNA (a sense strand and an antisense strand) annealed to each
other to form a double stranded RNA molecule. Preferably the RNAi molecule is
derived from the exonic sequence of the Ciz1 gene or from another over-lapping gene.

In one embodiment unspliced mRNA is targeted with RNAi to inhibit production
of the spliced variant. In another, the spliced variant mRNA is ablated without affecting
the non-variant mRNA.

In a preferred method of the invention said peptide is an oligopeptide. Preferably, said
oligopeptide is at least 10 amino acids long. Preferably said oligopeptide is at least 20,
30, 40 or 50 amino acids in length.

In a further preferred method of the invention said peptide is a modified peptide. 

It will be apparent to one skilled in the art that modified amino acids include, by way of
example and not by way of limitation, 4-hydroxyproline, 5-hydroxylysine,
N6-acetyllysine, N6-methyllysine, N6,N6-dimethyllysine, N6,N6,N6-trimethyllysine,
cyclohexylalanine, D-amino acids and ornithine. Other modifications include amino acids
with a C2, C3 or C4 alkyl R group optionally substituted by 1, 2 or 3 substituents selected
from halo (e.g. F, Br, I), hydroxy or C1-C4 alkoxy.

Alternatively said peptide is modified by acetylation and/or amidation. 

In a preferred method of the invention the polypeptides or peptides are modified by
cyclisation. Cyclisation is known in the art (see Scott et al Chem Biol (2001), 8:801-
815; Gellerman et al J. Peptide Res (2001), 57: 277-291; Dutta et al J. Peptide Res
(2000), 8: 398-412; Ngoka and Gross J Amer Soc Mass Spec (1999), 10:360-363).

According to a further aspect of the invention there is provided a vector as a delivery
means for an antisense or an RNAi molecule which inhibits Ciz1 or variants thereof and
thereby allows the targeting of cells expressing the truncated protein.

Preferably the vector includes an expression cassette comprising the nucleotide sequence
selected from the group consisting of:

a) the nucleic acid sequence which encodes Ciz1 amino acid sequence as shown
in Fig 8a or 8b;

b) a nucleic acid molecule which hybridizes to the nucleic acid sequence of (a);

c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate
because of the genetic code to the sequences in a) and b) and any sequence which
is complementary to any of the above sequences;

d) a nucleic acid sequence that encodes Ciz1 pre-mRNA (i.e., the genomic
sequence)

wherein the expression cassette is transcriptionally linked to a promoter sequence.

Preferably the vector including the expression cassette is adapted for eukaryotic gene
expression. Typically said adaptation includes, by example and not by way of
limitation, the provision of transcription control sequences (promoter sequences) which
mediate cell/tissue specific expression. These promoter sequences may be cell/tissue
specific, inducible or constitutive.

Promoter elements typically also include the so called TATA box and RNA polymerase
initiation selection sequences which function to select a site of transcription initiation.
These sequences also bind polypeptides which function, inter alia, to facilitate
transcription initiation selection by RNA polymerase.

Adaptations also include the provision of selectable markers and autonomous replication
sequences which both facilitate the maintenance of said vector in either the eukaryotic
cell or prokaryotic host. Vectors which are maintained autonomously are referred to as
episomal vectors. Further adaptations which facilitate the expression of vector encoded
genes include the provision of transcription termination sequences.

These adaptations are well known in the art. There is a significant amount of published
literature with respect to expression vector construction and recombinant DNA
techniques in general. Please see Sambrook et al (1989) Molecular Cloning: A
Laboratory Manual, Cold Spring Harbour Laboratory, Cold Spring Harbour, NY and
references therein; Marston, F (1987) DNA Cloning Techniques: A Practical Approach
Vol III IRL Press, Oxford UK; DNA Cloning: F M Ausubel et al, Current Protocols in
Molecular Biology, John Wiley & Sons, Inc. (1994).

According to the present invention there is provided a diagnostic method for the
identification of proliferative disorders comprising detecting the expression of the Ciz1
gene and mutations in the genomic sequence.

Preferably said diagnostic method comprises the steps of:

contacting a sample isolated from a subject to be tested with an agent which
specifically binds a polypeptide with Ciz1 activity or a nucleic acid molecule
encoding a polypeptide with Ciz1 activity; and

detecting or measuring the binding of the agent to said polypeptide or nucleic
acid in said sample.

In one embodiment, the diagnostic method of the present invention is carried out in vivo.

Preferably the diagnostic method provides for a quantitative measure of Ciz1 RNA or
protein variants in a sample.

In one embodiment of the invention there is provided the use of an agent which
modulates Ciz1 RNA or protein, or variants thereof, as a pharmaceutical.

Preferably said pharmaceutical comprises an agent identified by the screening method of
the present invention and a pharmaceutically acceptable carrier, excipient or diluent.

Preferably said pharmaceutical is for oral or topical administration or for administration 
by injection. 


In a further preferred embodiment of the invention there is provided the use of an agent 
according to the invention for the manufacture of a medicament for use in the treatment 
of proliferative disease. Preferably said proliferative disease is cancer. 

Preferably said cancer is a paediatric cancer and is selected from the group consisting of:
retinoblastoma, neuroblastoma, Burkitt lymphoma, medulloblastoma.

In an alternate embodiment the disease is liver cancer or metastasis.

According to a further aspect of the invention there is provided a method to treat a
proliferative disease comprising administering to an animal, preferably a human, an
agent obtainable by the method according to the invention.

Preferably said proliferative disease is cancer and is preferably a paediatric cancer
selected from the group consisting of: retinoblastoma, neuroblastoma, Burkitt
lymphoma, medulloblastoma.

In an alternative embodiment the disease is liver cancer or metastasis.

According to an alternate aspect of the invention, there is provided the use of an agent 
according to the invention for the manufacture of a medicament to slow cell division or 
growth. 

The invention also includes the use of the Ciz1 amino acid sequence and protein
structure in rational drug design and the use of Ciz1 or its variants or derived peptides
thereof for screening chemical libraries for agents that specifically bind to Ciz1.

An embodiment of the invention will now be described by example only and with
reference to the following figures:

Fig. 1 Illustrates the effect of cyclin A-cdk2 on late G1 nuclei. A) DNA synthesis is
activated by recombinant cyclin A-cdk2. Late G1 nuclei (harvested 17 hours after
release from quiescence) were incubated in mid-G1 phase (15 hour) cytosolic extract
supplemented with recombinant cyclin-dependent kinases at the indicated
concentrations. The number of replicating nuclei increased in the presence of cyclin A-
cdk2 (but not cyclin E-cdk2), within a narrow range of concentrations. Control bar
shows the number of nuclei in this population that are already in S phase (unshaded) and
the fraction that are induced to replicate by S phase extract (shaded). B) Detection of
mouse p85. Asynchronous 3T3 cells were separated into soluble and insoluble fractions
by hypotonic lysis and centrifugation. Anti-human Cdc6 antibody VI reacts with mouse
Cdc6 and a second antigen in the 85kDa range, that is present in both fractions. C) Late
G1 nuclei (17 hour) were incubated in mid-G1 extract in the presence of recombinant
cyclin A-cdk2 (as indicated). After 15 minutes nuclei were washed and the chromatin
fraction was isolated and separated by SDS-PAGE. p85 accumulates in the chromatin
fraction in the presence of cyclin A-cdk2, peaking at the same concentration as initiation
of DNA replication. Chromatin bound Mcm3 is shown as a control.

Fig. 2 Illustrates the embryonic form mouse Ciz1 protein aligned with the predicted full-
length form. Embryonic form mouse Ciz1 (ECiz1) lacks three sequence blocks
compared to the predicted full-length form. Blue stars indicate amino acids which have
been targeted by site-directed mutagenesis.

Fig. 3 Illustrates Ciz1 splice variants. A) Summary of mouse Ciz1 variants. In addition
to the three variant sequence blocks (encoded by exons 1 and 2, and part of exons 6 and
8) that were identified by analysis of ECiz1, Ciz1 from mouse ES cells lacks a fourth
region which corresponds to exons 3 and 4. Grey bar represents unchecked and
potentially variable sequence. B) Human Ciz1 is also alternatively spliced with variability
occurring in the same four regions. Human Ciz1 variants are all derived from pediatric
cancers or from an embryonic source. Grey bar represents uncertain sequence that may be
derived from an alternate reading frame. C) Similarity between mouse and human
variable exons. Identical amino acids are shown in red.




Fig. 4 Illustrates ECiz1 promotes initiation of mammalian DNA replication. A) In vitro
replication reactions containing late G1 nuclei and S phase cytosolic extract were
supplemented with purified recombinant ECiz1. Histogram shows the average number
of replicating nuclei with and without ECiz1, and standard deviations. In vitro initiating
nuclei (black) are shown above the 'background' level of S phase nuclei in this
population (unshaded). Images show nuclei incubated in the absence (i) or presence (ii)
of ECiz1 (yellow). Total nuclei are counterstained with propidium iodide (red). B) The
effect of recombinant ECiz1 is concentration dependent with a sharp optimum in the
nM range. In this experiment, and those shown in figs 4C and D, and in figs 6A, B and
C, results are expressed as % initiation rather than % replication. This is calculated from
the number of nuclei that initiate in vitro and the number of nuclei that are 'competent'
to initiate in vitro (see methods). C) The replication function of ECiz1 resides in the N-
terminal fragment, N-term442 (shown in fig. 5B). D) No initiation activity is detected in
the C-terminal fragment C-term274 (shown in fig. 5B), when assayed over a broad
range of concentrations.

Fig. 5 Illustrates mouse Ciz1 protein and derived expression constructs. A) Embryonic
mouse Ciz1 protein showing sequence features and putative domains. Consensus cdk
phosphorylation sites that are conserved (white stars) or not conserved (black stars) in
human Ciz1 are indicated. The region containing the putative p21-interacting
sequence (white dots) is defined in mouse embryonic Ciz1 by [...]
tested human Ciz1 construct that interacts with p21Cip1 (Mitsui et al., 1999). The
positions of sequences that are absent in embryonic form Ciz1 are indicated by white
triangles. B) Embryonic Ciz1 (ECiz1) and derived constructs used to produce protein for
replication assays. Numbers in parentheses relate to amino-acid positions in the
predicted full-length form of Ciz1 shown in fig. 2.

Fig. 6 Illustrates that Ciz1 is regulated by cyclin-dependent kinases. A) Stimulation of
initiation by recombinant ECiz1 mutant T(191/2)A, in which two threonines in the
conserved consensus cdk phosphorylation site at amino acids 191 and 192 have been
changed to alanines. Unlike ECiz1, this protein is not subject to down-regulation of
activity at high concentrations, indicating that endogenous cdks regulate Ciz1 activity.

Figure 7 Endogenous Ciz1 is present at the same sites in the nucleus where DNA
replication takes place. Recombinant ECiz1 was used to raise a specific rabbit
polyclonal antibody that recognises endogenous human and mouse Ciz1. The antibody
was applied to asynchronously growing mouse 3T3 cells and to human HeLa cells after
detergent extraction (to remove soluble proteins) and fixation. In both cell types
anti-ECiz1 reveals a punctate staining pattern that is similar to that seen with antibodies to
DNA replication proteins. A) Dual detection of Ciz1 (red), and the DNA replication
factor PCNA with monoclonal antibody PC10 (Sigma, green) reveals that almost all
PCNA staining overlaps with Ciz1 staining in the merged image, but that additional
regions of Ciz1 staining exist. An S phase cell that contains detergent resistant PCNA is
shown. B) A monoclonal anti-human SC35 antibody (Sigma) reveals the sites at which
RNA splicing takes place in the nucleus of HeLa cells (green). Human Ciz1 does not
co-localize with SC35 in these cells. All methods are as previously described (Coverley et
al., 2001 or references therein).

Figure 8 illustrates a) mouse full length cDNA sequence, b) human full length cDNA
sequence, and c) human full length protein sequence.




Materials and Methods

Cloning. A lambda TriplEx 5'-stretch, full length enriched cDNA expression library
derived from 11 day old mouse embryos (Clontech ML5015t) was used to infect E.
coli XL1-Blue according to the recommended protocol (Clontech). Plaques were lifted
onto 0.45 micron nitrocellulose filters pre-soaked in 10 mM IPTG (Sigma). Affinity
purified antibody V1 was applied to approximately 3 x 10^6 plaques at 1/1000 dilution in
PBS, 10% non-fat milk powder, 0.4% Tween 20, after blocking for 30 minutes in the
absence of antibody. After two hours filters were washed three times with the same
buffer and reactive plaques were visualized with anti-rabbit secondary antibody
conjugated to horseradish peroxidase (Sigma), and enhanced chemiluminescence
(ECL, Amersham) according to standard procedures. 43 independent plaques were
picked but only two strains of phage survived a further three rounds of screening. These
were converted to pTriplEx by transforming into BM25.8 and sequenced. One codes for
mouse Cdc6 (clone P) and the other (clone L) for an unknown mouse protein that is
homologous to human Ciz1. Sequence alignments were performed with Multalin
(Corpet, 1998).

Antibodies. Rabbit polyclonal antibody V1 was raised against an internal fragment of
bacterially expressed human Cdc6 corresponding to amino-acids 195-412 and affinity
purified against this fragment by standard procedures (Harlow and Lane, 1988).

Constructs. pGEX-ECiz1 was generated by inserting a 2.3 kb SmaI-XbaI (blunt ended)
fragment from pTriplEx-clone L into the SmaI site of pGEX-6P-3 (Amersham).
pGEX-Nterm442 was generated by inserting the 1.35 kb XmaI-XhoI fragment into XmaI-XhoI
digested pGEX-6P-3, and pGEX-Cterm274 by inserting the 0.95 kb XhoI fragment into
XhoI digested pGEX-6P-3.

pGEX-T(191/2)A was generated from pGEX-ECiz1 by site directed mutagenesis
(Stratagene QuikChange 200518-5) of the conserved consensus cdk phosphorylation site
at amino-acids 191 and 192 using primers
AACCCCCTCTTCCGCCGCCCCCAATCGCAAGA and
TCTTGCGATTGGGGGCGGCGGAAGAGGGGGTT, producing two alanines in place
of two threonines. pGEX-T(293)A was generated from pGEX-ECiz1 by site directed
mutagenesis using primers AAGCAGACACAGGCCCCGGATCGGCTGCCT and
AGGCAGCCGATCCGGGGCCTGTGTCTGCTT, changing the threonine at 293 to
alanine.

All clones were verified by DNA sequence analysis prior to transfer into E. coli BL21
for protein expression.
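
Mutagenesis primer pairs of this kind are normally exact reverse complements of one another, which gives a simple consistency check on the primer sequences quoted above. The following Python sketch is illustrative only: the function names are hypothetical, and the reverse primer for T(191/2)A is taken with its two ambiguously printed characters read as T and G, which is an assumption.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Primer pairs as quoted in the Constructs section. The ambiguous characters
# in the T(191/2)A reverse primer are ASSUMED to be T and G.
PRIMERS = {
    "T(191/2)A": ("AACCCCCTCTTCCGCCGCCCCCAATCGCAAGA",
                  "TCTTGCGATTGGGGGCGGCGGAAGAGGGGGTT"),
    "T(293)A": ("AAGCAGACACAGGCCCCGGATCGGCTGCCT",
                "AGGCAGCCGATCCGGGGCCTGTGTCTGCTT"),
}


def primers_are_complementary(forward: str, reverse: str) -> bool:
    """True when the reverse primer is the reverse complement of the forward."""
    return reverse_complement(forward) == reverse


for name, (fwd, rev) in PRIMERS.items():
    print(name, primers_are_complementary(fwd, rev))
```

Under that reading both pairs are mutually complementary, which supports the reconstruction of the garbled characters but does not replace checking against the original filing.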

Protein expression. Recombinant Ciz1, Ciz1 fragments and point mutants were produced
in BL21-pLysS (Stratagene) as glutathione S-transferase-tagged protein. This was
purified from sonicated and cleared bacterial lysates by binding to glutathione sepharose
4B (Amersham). Recombinant protein was eluted by cleavage from the GST tag using
PreScission protease (as recommended by the manufacturer, Amersham), into buffer
(50 mM Tris-HCl pH 7.0, 150 mM NaCl, 1 mM DTT). This yielded protein preparations
between 0.2 and 2.0 mg/ml. For replication assays serial dilutions were made in 100 mM
Hepes pH 7.8, 1 mM DTT, 50% glycerol so that not more than 1 µl of protein solution
was added to 10 µl replication assays, yielding the concentrations shown.



Cell synchrony. Mouse 3T3 cells were synchronized by release from quiescence as
previously described (Coverley et al., 2002). Nuclei were prepared from cells harvested
between 16 and 18 hours after release, yielding populations containing S phase nuclei,
replication competent late G1 nuclei and unresponsive early G1/G0 nuclei, in varying
proportions. Recipient, mid-G1 3T3 extracts were prepared from cells harvested 15
hours after release from quiescence (these typically contain approximately 5% S phase
cells). S phase extracts were prepared from HeLa cells released for two hours from two
sequential thymidine-induced S phase blocks, as outlined (Krude et al., 1997). HeLa
cells are used for S phase extracts because they are easily synchronized in large
quantities.

Replication assays. Nuclei and extract preparation has been previously described
(Coverley et al., 2000; Krude et al., 1997; Stoeber et al., 1998). In vitro replication
assays were performed as described (Coverley et al., 2002). Reactions containing 10 µl
of extract (supplemented with energy regenerating system and nucleotides including
biotinylated dUTP), and 5 x 10^4 nuclei were incubated with Ciz1 at the indicated
concentrations, for 60 mins at 37°C. Reactions were stopped by the addition of 50 µl of
0.5% Triton X-100 and fixed by the addition of 50 µl of 8% paraformaldehyde. After
transfer to coverslips nuclei were stained with streptavidin-FITC (Amersham) to reveal
DNA synthesized in vitro, and counterstained with TOTO-3 iodide. The proportion of
labeled nuclei in each sample was quantified by inspection at 1000X magnification.
Nuclei with fluorescent foci were scored positive. Images of in vitro replicating nuclei
were generated by confocal microscopy at 600X magnification, of samples
counterstained with propidium iodide.

Data analysis and presentation. Prior to use in initiation assays each preparation of
synchronized G1 phase nuclei is tested so that the proportion of nuclei that are already in
S phase is established ('%S'). To do this, nuclei are incubated in an extract that is
incapable of inducing initiation of DNA synthesis (from mid-G1 phase cells harvested
15 hours after release from quiescence), but that will efficiently support elongation DNA
synthesis from origins that were initiated in vivo. The elongating fraction of nuclei
incorporates labeled nucleotides efficiently during in vitro initiation assays but is
uninformative. Routinely this fraction is pre-established and subtracted from the raw
data. Synchronized populations in which 20% or less are in S phase are used for
initiation assays.

Similarly, for each preparation of nuclei the fraction that is capable of responding to
inducers of S phase and initiating DNA synthesis in vitro (the competent fraction) is also
pre-established. This is taken to be the maximum proportion of nuclei that replicate in
vitro under any conditions, and is usually around 50% of the total population.

When 3T3 cells are released from quiescence by the protocol used here a maximum of
70% of the total population enters S phase in vivo (Coverley et al., 2002) while the rest
do not re-enter the cell cycle. This defines the maximum possible initiation frequency at
around 70%. For most populations this is nearer 60%, so the proportion of G1 nuclei that
replicate in vitro (around 50%) is comparable to the proportion that would have entered
S phase in vivo.

The late G1 population of 3T3 nuclei used in the replication experiments shown here
was harvested 17 hours after release from quiescence. 17% of these were already in S
phase (fig. 4A) and the maximum number that have replicated in any assay in vitro is
50%. Therefore, 33% of this total population is in late G1 and competent to replicate
(%C). For each data point the initiating fraction ('% initiation') is calculated using %S
and %C according to the formula, % Initiation = (% replication - %S)/%C.
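
The calculation can be sketched in Python as follows. The function and parameter names are hypothetical, and the factor of 100 (which expresses the ratio as a percentage) is an assumption, since the formula as printed gives a fraction. The inputs are values quoted in this document: 17% S phase nuclei, 33% competent nuclei, and the 30% and 46% replication values reported for fig. 4A.

```python
def percent_initiation(pct_replication: float, pct_s: float, pct_competent: float) -> float:
    """Percentage of competent nuclei that initiated replication in vitro.

    pct_replication: % of all nuclei scored as replicating in the assay
    pct_s:           % of nuclei already in S phase before the assay (%S)
    pct_competent:   % of nuclei competent to initiate in vitro (%C)
    The multiplication by 100 is an ASSUMPTION made to report a percentage.
    """
    if pct_competent <= 0:
        raise ValueError("competent fraction must be positive")
    return 100.0 * (pct_replication - pct_s) / pct_competent


PCT_S = 17.0          # nuclei already in S phase
PCT_COMPETENT = 33.0  # 50% maximum replication minus 17% in S phase

without_eciz1 = percent_initiation(30.0, PCT_S, PCT_COMPETENT)
with_eciz1 = percent_initiation(46.0, PCT_S, PCT_COMPETENT)
print(f"S phase extract alone: {without_eciz1:.1f}% initiation")
print(f"with ECiz1:            {with_eciz1:.1f}% initiation")
```

On these inputs roughly 39% of competent nuclei initiate in S phase extract alone and roughly 88% initiate when ECiz1 is added.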






Results 

Identification of Ciz1. Late G1 3T3 nuclei (isolated 17 hours after release from
quiescence) can be induced to begin DNA synthesis by recombinant cyclin A-cdk2
(Coverley et al., 2002, and fig. 1A). The response to cyclin A-cdk2 is strictly dependent
upon the concentration of active kinase that is added to the recipient extract (from
mid-G1 phase cells harvested 15 hours after release from quiescence). When nuclei are
incubated under these conditions a number of changes to chromatin bound proteins can
be seen. We have identified one of these proteins by expression cloning.



Our previous work on replication complex assembly proteins generated a polyclonal
antibody against human Cdc6 (V1), which has been used in previous studies (Coverley
et al., 2000; Stoeber et al., 1998; Williams et al., 1998). When applied to material
derived from mouse 3T3 cells antibody V1 reacts with Cdc6 and another protein of
approximately 85 kDa (fig. 1B). Antibody V1 has been used to follow p85 through in
vitro replication complex assembly and activation reactions. Strikingly, p85 fractionates
with chromatin at the same cyclin A-cdk2 concentration that activates DNA synthesis in
vitro and becomes resistant to extraction (fig. 1C). This implicates p85 in initiation of
DNA replication.

When applied to a cDNA expression library derived from 11-day mouse embryos
antibody V1 picked out two clones that survived multiple rounds of screening. One
encodes mouse Cdc6, while the other encodes 716 amino acids of an uncharacterized
mouse protein, homologous to human Ciz1. Human Ciz1 (Accession Number
AB030835), the predicted mouse Ciz1 cDNA sequence (XM123748) and a full-length
cDNA clone derived from a mouse mammary tumour library (No. BC018483) all
contain three additional sequence blocks that are not present in our embryonic mouse
Ciz1 clone (fig. 2A). These correspond to exons 1 and 2, and parts of exons 6 and 8 of
the mouse gene. Our analysis has focused on the activity that resides in the embryonic
form of mouse Ciz1 (ECiz1) that lacks these exons (fig. 2).

Human tumours express Ciz1 variants that resemble mouse embryonic Ciz1. Human
Ciz1 also exists as multiple isoforms, the functions of which are unstudied. The cDNA
encoding the predicted full-length form (XM 026951) was isolated from a B-cell library
(Mitsui et al., 1999), while a number of variant sequences have been found by
large-scale genome analysis projects (HGMP, IMAGE, NCBI, HEMBA). Ciz1 variants
(AK023978, BC004119, BC021163, AF234161, AK027287) lack one or more of the
same exons that are absent in mouse embryonic Ciz1. Furthermore, they are all derived
from paediatric cancers (retinoblastoma, neuroblastoma, Burkitt lymphoma,
medulloblastoma) or from a human embryo. The data on Ciz1 isoforms has been
collated and is summarised in fig. 3.

Ciz1 stimulates initiation in vitro. Recombinant ECiz1 protein stimulates initiation of
DNA replication in late G1 3T3 nuclei (18 hour), when used to supplement extracts from
S phase HeLa cells (fig. 4A). Upon addition of ECiz1 the number of nuclei that
replicated in vitro was increased from 30% (+/- 0.856) to 46% (+/- 5.52). This increase is
in addition to the 13% of nuclei that initiate replication in response to S phase cytosol
alone, and the 17% that are already in S phase. When combined with the observation
that p85-Ciz1 associates with chromatin under the same conditions that support initiation
in vitro, this result argues strongly for a role for Ciz1 in initiation of mammalian DNA
replication.

Stimulation of initiation is sensitive to the concentration of recombinant ECiz1, with
peak activity at around 2 nM (fig. 4B). This concentration dependent response to ECiz1
echoes the results of previous experiments in which recombinant cdks were added to
mammalian cell-free replication systems (Coverley et al., 2002; Krude et al., 1997; and
fig. 1A). In all of these experiments stimulation of initiation is lost when an excess of
active protein is added.

ECiz1 possesses several sequence features in its C-terminal
third that suggest possible functions (fig. 5A). The matrin 3 domain implies
interaction with the nuclear matrix and the three C2H2 type zinc-fingers suggest
interaction with DNA or RNA. Here, we report that the replication function of ECiz1
does not require these domains. N-term442 (fig. 5B), which lacks the matrin 3
domain and two of the zinc fingers, stimulates initiation in the same way as intact ECiz1
(fig. 4C). The proportional increase in initiation and the active concentration are
essentially the same for both forms of the protein. Furthermore, the C-terminal portion,
C-term274, contains no residual replication activity in our assay (fig. 4D). These data
show that, when assayed in trans, the nucleotide and matrix interaction domains do not
contribute to ECiz1 replication function. It remains possible, however, that these domains
play a role in localizing endogenous Ciz1 to specific sites in the nucleus in vivo.

ECiz1 contains sequences that conform to the consensus for cdk phosphorylation,
suggesting that cyclin dependent kinases regulate Ciz1 function (fig. 5A). Two of these
sites are conserved in human and mouse Ciz1. One of these is in the C-terminal portion
of Ciz1, which is not required for initiation function, while the remaining site is located
in the N-terminal portion adjacent to the site at which exon 6 is alternately spliced. We
have mutated this cdk site in ECiz1, changing two threonines at 191 and 192 to two
alanines, generating ECiz1T(191/2)A. Like ECiz1, recombinant ECiz1T(191/2)A
stimulates initiation of DNA replication in late G1 nuclei (fig. 6A). It does this to a
similar extent and at a similar concentration as ECiz1; however, unlike ECiz1 the
response to ECiz1T(191/2)A is not down regulated when it is added at high
concentrations. This demonstrates that the stimulatory effect of ECiz1 on initiation of
DNA replication is regulated by cyclin-dependent kinases. Furthermore, it suggests a
sensing mechanism that monitors the level of Ciz1, which is capable of distinguishing
functional levels (nM range) from excess (10 nM range).

The downturn in the response curve for ECiz1 (fig. 4B) returns initiation to the level
achieved with S phase extract alone (in the absence of added Ciz1), rather than to the
level achieved with G1 phase extract (which only supports elongation DNA synthesis).
Therefore, negative regulation of ECiz1 dependent initiation does not affect all initiation
events, only those influenced by recombinant ECiz1.

The consensus cdk phosphorylation site at amino acid 293 was also mutated with no
effect on replication activity (fig. 6C).

Ciz1 colocalizes with the replication factor PCNA, but not the splicing factor SC35,
indicating that Ciz1 is present at the sites where DNA replication takes place (see Figure 7).



Embryonic form Ciz1 (ECiz1) positively regulates initiation of mammalian DNA
replication and is found at sites of DNA replication inside the nucleus of cells. Its
chromatin binding and replication function is regulated by cdk2. It may also modulate
cdk2 activity via p21. This demonstration of a positive role in DNA replication initiation,
combined with the embryo and cancer-derived splice variants, suggests the following
working hypothesis.

Replication activity resides in the embryonic form of Ciz1 (proven). Inclusion of
additional exons by a regulated splicing mechanism during the course of normal
development adds functions that are not present in the embryonic form. This could
selectively restrain the initiation function of Ciz1 in response to external signals.
Expression of embryonic forms of Ciz1 at the wrong time in development could lead to
cancer.

For example, one of the variable exons encodes a short conserved DSSSQ sequence
motif that is absent in mouse ECiz1 and in a human medulloblastoma. This is directly
adjacent to the consensus cdk phosphorylation site that we have shown to be involved in
regulation of ECiz1 function. Conditional inclusion of the DSSSQ sequence might make
Ciz1 the subject of regulation by the ATM/ATR family of protein kinases, which
phosphorylate proteins at SQ sequences, thereby restraining Ciz1 initiation function in
response to DNA damage.
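
The proximity of the two motifs can be illustrated with a short Python sketch that scans a fragment of the human Ciz1 protein sequence from Figure 8c. Several things here are assumptions made for illustration: the fragment is read with the printed 'O' taken as 'Q', the cdk consensus is written as the commonly used pattern [S/T]-P-X-[K/R] (the document does not spell out the consensus), and the function and variable names are hypothetical.

```python
import re

# ASSUMED full cdk consensus: Ser/Thr, Pro, any residue, Lys/Arg.
CDK_CONSENSUS = re.compile(r"(?=([ST]P.[KR]))")
# ATM/ATR family kinases phosphorylate Ser followed by Gln (SQ).
SQ_MOTIF = re.compile(r"(?=(SQ))")


def motif_starts(pattern: re.Pattern, protein: str) -> list:
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [match.start() for match in pattern.finditer(protein)]


# Fragment of human Ciz1 spanning the conserved cdk site and the DSSSQ motif,
# taken from Figure 8c with 'O' read as 'Q' (an assumption about the print).
fragment = "QARTSSSTTPNRKDSSSQTMPVEDK"

cdk_sites = motif_starts(CDK_CONSENSUS, fragment)
sq_sites = motif_starts(SQ_MOTIF, fragment)
print("cdk consensus starts:", cdk_sites)
print("SQ motif starts:", sq_sites)
print("residues between motif starts:", sq_sites[0] - cdk_sites[0])
```

In this fragment the TPNR cdk site and the SQ dipeptide of DSSSQ begin eight residues apart, consistent with the description of the two motifs as directly adjacent.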

In summary, variable exon usage, amino acid sequence and overall gene structure are
highly conserved between mouse and human Ciz1, and forms derived from early in
development (or from cancer cells) have fewer exons than those from differentiated cells. The
functions of these variable exons are unstudied and it is possible that errors in Ciz1
splicing during development could contribute to the formation or progression of cancer
cell lineages by allowing Ciz1 to escape regulatory pathways and promote DNA
replication at the wrong time.

Ciz1 replication function is negatively regulated by cyclin-dependent kinase

Figure 8a: Mouse full length cDNA sequence

CATGTTCAAC CCGCAACTCC AGCAGCAGCA ACAGTTGCAG CAGCAGCAGC 
AACAGTTGCA GCAGCAGCTC CAGCAGCAGC AGCTCCAGCA GCAGCAACAG 
CAGATACTGC AGCTCCAACA GCTGCTGCAA CAGTCCCCAC CACAGGCCTC 
CTTGTCCATT CCTGTCAGCC GGGGCCTCCC CCAGCAGTCA TCCCCGCAAC 
AGCTTCTGAG TCTCCAGGGC CTCCACTCGA CCTCCCTGCT CAATGGCCCC 
ATGCTGCAAA GAGCTTTGCT CCTACAGCAG TTGCAAGGAC TGGACCAGTT 
TGCAATGCCA CCAGCCACGT ATGACGGTGC CAGCCTCACC ATGCCTACGG 
CAACACTGGG TAACCTCCGT GCTTTCAATG TGACAGCCCC AAGCCTAGCA 
GCTCCCAGCC TTACACCACC CCAGATGGTC ACCCCAAATC TGCAGCAGTT 
CTTTCCCCAG GCTACTCGAC AGTCTCTGCT GGGGCCTCCT CCTGTTGGGG 
TCCCAATAAA CCCTTCTCAG CTCAACCACT CAGGGAGGAA CACCCAGAAA 
CAGGCCAGAA CCCCCTCTTC CACCACCCCC AATCGCAAGG ATTCTTCTTC 
TCAGACGGTG CCTCTGGAAG ACAGGGAAGA CCCCACAGAG GGGTCTGAGG 
AAGCCACGGA GCTCCAGATG GACACATGTG AAGACCAAGA TTCACTAGTC 
GGTCCAGATA GCATGCTGAG TGAGCCCCAA GTGCCTGAGC CTGAGCCCTT 
TGAGACATTG GAACCACCAG CCAAGAGGTG CAGGAGCTCA GAGGAGTCCA 
CCGAGAAAGG CCCTACAGGG CAGCCACAAG CAAGGGTCCA GCCTCAGACC 
CAGATGACAG CACCAAAGCA GACACAGACC CCGGATCGGC TGCCTGAGCC 
ACCAGAAGTC CAAATGCTGC CGCGTATCCA GCCACAGGCA CTGCAGATCC 
AGACCCAGCC AAAGCTGCTG AGGCAGGCAC AGACACAGAC CTCTCCAGAG 
CACTTAGCGC CCCAGCAGGA TCAGGTAGAG CCACAGGTAC CATCACAGCC 
CCCATGGCAG TTGCAGCCAC GGGAGACAGA CCCACCGAAC CAAGCTCAGG 
CACAGACCCA GCCTCAGCCC CTCTGGCAGG CGCAGTCACA GAAGCAGGCC 
CAGACACAGG CACATCCACA GGTACCCACC CAAGCACAGT CACAGGAGCA 
GACATCAGAG AAGACCCAGG ACCAGCCTCA GACCTGGCCA CAGGGGTCAG 
TACCCCCACC AGAACAAGCG TCAGGTCCAG CCTGTGCCAC GGAACCACAG 
CTATCCTCTC ACGCTGCAGA AGCTGGGAGT GACCCAGACA AGGCCTTGCC
AGAACCAGTA AGTGCCCAGA GCAGTGAAGA CAGGAGCCGG GAGGCGTCCG
CTGGTGGCCT GGATTTGGGA GAATGTGAAA AGAGAGCGGG AGAGATGCTG 
GGGATGTGGG GGGCTGGGAG CTCCCTGAAG GTCACCATCC TGCAGAGTAG 
CAACAGCCGG GCCTTTAACA CCACACCCCT CACATCTGGA CCTCGCCCTG 
GGGACTCTAC CTCTGCCACC CCTGCCATTG CCAGCACACC CTCCAAGCAA 
AGCCTCCAGT TCTTCTGCTA CATCTGCAAG GCCAGCAGCA GCAGCCAGCA 
GGAGTTCCAG GATCACATGT CAGAGGCTCA GCACCAACAG CGGCTTGGGG 
AAATACAACA CTCGAGCCAG ACCTGCCTGC TGTCCCTGCT GCCCATGCCT 
CGGGACATCC TGGAGAAAGA AGCGGAAGAT CCTCCGCCCA AACGCTGGTG 
CAACACCTGC CAGGTGTACT ACGTGGGAGA CTTGATCCAG CACCGTAGGA 
CACAGGAGCA CAAGGTTGCC AAACAATCCC TGAGGCCCTT CTGCACCATA 
TGCAACCGTT ACTTCAAGAC CCCTCGAAAG TTTGTGGAGC ACGTGAAGTC 
CCAGGGACAC AAGGACAAGG CCCAAGAGCT GAAGACACTT GAAAAGGAGA
CAGGCAGCCC AGATGAGGAC CACTTCATCA CTGTGGACGC CGTCGGTTGC 
TTTGAGAGTG GTCAAGAAGA GGACGAGGAT GACGACGAGG AAGAAGAAGA
AGAAGGAGAG ATTGAGGCTG AGGAGGAATT CTGCAAGCAG GTGAAGCCGA 
GAGAAACATC CTCAGAGCAA GGGAAGGGCT CTGAGACGTA CAACCCCAAC 
ACAGCCTATG GTGAGGATTT CCTGGTGCCA GTGATGGGCT ATGTCTGTCA 



AATCTGTCAC AAGTTCTACG ACAGCAACTC AGAATTGCGG CTTTCTCACT 
GCAAGTCCCT GGCCCACTTT GAGAACCTGC AGAAATACAA AGCCAAGAAC 
CCAAGCCCTC CTCCTACCCG GCCTGTGAGC CGCAAGTGTG CCATCAACGC 
CCGCAACGCC CTGACTGCAC TGTTCACCTC TAGCCACCAG CCCAGCCCCC 
AGGACACAGT GAAAATGCCC AGCAAGGTGA AGCCTGGATC CCCCGGACTC 
CCTCCTCCCC TTCGGCGCTC AACACGCCTC AAAACCTGAT AGAGGGAGCT 
CTGGCCACTC AGCCTGACTA AGGCTCAGTC TGCTAATGCT TCCTAGGTAT 
CTGTGTAGAA ATGTTCAAGT GGTTGGTGTT TTTACTCAAA ATCCAATAAA 
GAGTCAGTAG TTTGGCAAAA AAAAAAAAAA AAAAAAA 






Figure 8b: Human full length cDNA sequence

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGCGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTGCA 
GCAACAGCAG CAGCAGCTCC AGCAGTTACA GCAGCAGCAG CTCCAGCAGC 
AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 
TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 
CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 
TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 
CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 
CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 
ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 
GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 
CGAAAGGATT CTTCTTCTCA GACAATGCCT GTGGAAGACA AGTCAGACCC 
CCCAGAGGGG TCTGAGGAAG CCGCAGAGCC CCGGATGGAC ACACCAGAAG 
ACCAAGATTT ACCGCCCTGC CCAGAGGACA TCGCCAAGGA AAAACGCACT 
CCAGCACCTG AGCCTGAGCC TTGTGAGGCG TCCGAGCTGC CAGCAAAGAG 
ATTGAGGAGC TCAGAAGAGC CCACAGAGAA GGAACCTCCA GGGCAGTTAC 
AGGTGAAGGC CCAGCCGCAG GCCCGGATGA CAGTACCGAA ACAGACACAG 
ACACCAGACC TGCTGCCTGA GGCCCTGGAA GCCCAAGTGC TGCCACGATT 
CCAGCCACGG GTCCTGCAGG TCCAGGCCCA GGTGCAGTCA CAGACTCAGC 
CGCGGATACC ATCCACAGAC ACCCAGGTGC AGCCAAAGCT GCAGAAGCAG 
GCGCAAACAC AGACCTCTCC AGAGCACTTA GTGCTGCAAC AGAAGCAGGT 
GCAGCCACAG CTGCAGCAGG AGGCAGAGCC ACAGAAGCAG GTGCAGCCAC
AGGTACAGCC ACAGGCACAT TCACAGGGCC CAAGGCAGGT GCAGCTGCAG 
CAGGAGGCAG AGCCGCTGAA GCAGGTGCAG CCACAGGTGC AGCCCCAGGC 
ACATTCACAG CCCCCAAGGC AGGTGCAGCT GCAGCTGCAG AAGCAGGTCC 
AGACACAGAC ATATCCACAG GTCCACACAC AGGCACAGCC AAGCGTCCAG 
CCACAGGAGC ATCCTCCAGC GCAGGTGTCA GTACAGCCAC CAGAGCAGAC 
CCATGAGCAG CCTCACACCC AGCCGCAGGT GTCGTTGCTG GCTCCAGAGC 
AAACACCAGT TGTGGTTCAT GTCTGCGGGC TGGAGATGCC ACCTGATGCA 
GTAGAAGCTG GTGGAGGCAT GGAAAAGACC TTGCCAGAGC CTGTGGGCAC 
CCAAGTCAGC ATGGAAGAGA TTCAGAATGA GTCGGCCTGT GGCCTAGATG 
TGGGAGAATG TGAAAACAGA GCGAGAGAGA TGCCAGGGGT ATGGGGCGCC
GGGGGCTCCC TGAAGGTCAC CATTCTGCAG AGCAGTGACA GCCGGGCCTT 
TAGCACTGTA CCCCTGACAC CTGTCCCCCG CCCCAGTGAC TCCGTCTCCT 
CCACCCCTGC GGCTACCAGC ACTCCCTCTA AGCAGGCCCT CCAGTTCTTC 
TGCTACATCT GCAAGGCCAG CTGCTCCAGC CAGCAGGAGT TCCAGGACCA 
CATGTCGGAG CCTCAGCACC AGCAGCGGCT AGGGGAGATC CAGCACATGA 
GCCAAGCCTG CCTCCTGTCC CTGCTGCCCG TGCCCCGGGA CGTCCTGGAG 
ACAGAGGATG AGGAGCCTCC ACCAAGGCGC TGGTGCAACA CCTGCCAGCT 
CTACTACATG GGGGACCTGA TCCAACACCG CAGGACACAG GACCACAAGA 
TTGCCAAACA ATCCTTGCGA CCCTTCTGCA CCGTTTGCAA CCGCTACTTC 
AAAACCCCTC GCAAGTTTGT GGAGCACGTG AAGTCCCAGG GGCATAAGGA 
CAAAGCCAAG GAGCTGAAGT CGCTTGAGAA AGAAATTGCT GGCCAAGATG 



AGGACCACTT CATTACAGTG GACGCTGTGG GTTGCTTCGA GGGTGATGAA 
GAAGAGGAAG AGGATGATGA GGATGAAGAA GAGATCGAGG TTGAGGAGGA
ACTCTGCAAG CAGGTGAGGT CCAGAGATAT ATCCAGAGAG GAGTGGAAGG 
GCTCGGAGAC CTACAGCCCC AATACTGCAT ATGGTGTGGA CTTCCTGGTG 
CCCGTGATGG GCTATATCTG CCGCATCTGC CACAAGTTCT ATCACAGCAA 
CTCAGGGGCA CAGCTCTCCC ACTGCAAGTC CCTGGGCCAC TTTGAGAACC 
TGCAGAAATA CAAGGCGGCC AAGAACCCCA GCCCCACCAC CCGACCTGTG
AGCCGCCGGT GCGCAATCAA CGCCCGGAAC GCTTTGACAG CCCTGTTCAC 
CTCCAGCGGC CGCCCACCCT CCCAGCCCAA CACCCAGGAC AAAACACCCA 
GCAAGGTGAC GGCTCGACCC TCCCAGCCCC CACTACCTCG GCGCTCAACC 
CGCCTCAAAA CCTGATAGAG GGACCTCCCT GTCCCTGGCC TGCCTGGGTC 
CAGATCTGCT AATGCTTTTT AGGAGTCTGC CTGGAAACTT TGACATGGTT 
CATGTTTTTA CTCAAAATCC AATAAAACAA GGTAGTTTGG CTGTGCAAAA 
AAAAAAAAAA AAAAAAAAAA AA 




Figure 8c: Human full length protein 

MF SQQQQQLQQQ QQQLQQLQQQ QLQQQQLQQQ QLLQLQQLLQQSPPQ 
APLPM AVSRGLPPQQ PQQPLLNLQG TNSASLLNGS MLQRALLLQQLQ GL 
DQFAMP PATYDTAGLT MPTATLGNLR GYGMASPGLA APSLTPPQLATPN
LQQFFPQ ATRQSLLGPP PVGVPMNPSQ FNLSGRNPQK QARTSSSTTPNRK
DSSSQTM PVEDKSDPPE GSEEAAEPRM DTPEDQDLPP CPEDIAKEKRTPA
PEPEPCE ASELPAKRLR SSEEPTEKEP PGQLQVKAQP QARMTVPKQTQTP 
DLLPEAL EAQVLPRFQP RVLQVQAQVQ SQTQPRIPST DTQVQPKLQK 
QAQTQTSPEH LVLQQKQVQP QLQQEAEPQK QVQPQVQPQAHSQGPRQ
VQLQQEAEPLKQV QPQVQPQAHS QPPRQVQLQL QKQVQTQTYP QVHT
QAQPSVQPQEHPPAQV SVQPPEQTHE QPHTQPQVSL LAPEQTPVVV HVC
GLEMPPDAVEAGGGMEK TLPEPVGTQV SMEEIQNESA CGLDVGECEN
RAREMPGVWGAGGSLKVTIL QSSDSRAFST VPLTPVPRPS DSVSSTPAAT 
STPSKQ ALQFFCYICKASCS SQQEFQDHMS EPQHQQRLGE IQHMSQACLL 
SLLPVPRDVLETEDEEPPPR RWCNTCQLYY MGDLIQHRRT QDHKIAKQSL 
RPFCTVCNRYFKTPRKFVEH VKSQGHKDKA KELKSLEKEI AGQDEDHFIT 
VDAVGCFEGDEEEEEDDEDE EEIEVEEELC KQVRSRDISR EEWKGSETYS 
PNTAYGVDFLVPVMGYICRI CHKFYHSNSG AQLSHCKSLG HFENLQKYKA 
AKNPSPTTRPVSRRCAINAR NALTALFTSS GRPPSQPNTQ DKTPSKVTAR 
PSQPPLPRRSTRLKT 